The aim was to prepare a spatio-temporal representation of valuation studies related to biodiversity and ecosystem services and … .
To identify country names in the corpus of literature, we used a two-step approach. First, to understand where studies were conducted, we searched the title, abstract, and keywords of each paper for country names. Second, to understand where the funding institutions were located, we searched the affiliations, acknowledgments, and funding text for country names.
The input data were the following:

- Bib file downloaded from Web of Science
- ISO 3166-1 alpha-3 country codes (https://www.iso.org/iso-3166-country-codes.html)
- IPBES regional and subregional area dataset (https://doi.org/10.5281/zenodo.3923633)

The Python code used to georeference the corpus can be found here. An overview of the pipeline is provided in the following schematic and described below.
knitr::include_graphics("pilot2.svg")
Overview of the process of georeferencing the corpus of valuation studies
Step 1: Extract country names from text. Country names were extracted from the title, abstract, and keywords of each paper with a regular expression, and the associated ISO code was added to a column in the dataset. The same regular expression was then applied to the affiliations, acknowledgments, and funding text of the same paper, and the resulting ISO codes were placed in a second column.
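A minimal sketch of this kind of extraction is given below. It is not the project's actual code: the lookup table is a five-country excerpt invented for illustration, the function name `extract_iso3` and the record fields are hypothetical, and case-insensitive matching is an assumption.

```python
import re

# Hypothetical excerpt of an ISO 3166-1 alpha-3 lookup; a real run would load the full list.
COUNTRY_TO_ISO3 = {
    "Germany": "DEU",
    "Kenya": "KEN",
    "Brazil": "BRA",
    "Niger": "NER",
    "Nigeria": "NGA",
}

# Longest names first and word boundaries, so "Nigeria" is never read as "Niger".
_names = sorted(COUNTRY_TO_ISO3, key=len, reverse=True)
COUNTRY_RE = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in _names) + r")\b",
    re.IGNORECASE,  # assumption: matching ignores case
)
_lower = {name.lower(): iso for name, iso in COUNTRY_TO_ISO3.items()}


def extract_iso3(text):
    """Return the sorted, de-duplicated ISO3 codes of country names found in text."""
    found = {_lower[m.group(1).lower()] for m in COUNTRY_RE.finditer(text or "")}
    return sorted(found)


# Made-up record with hypothetical field names.
record = {
    "title": "Valuing wetlands in Kenya and Nigeria",
    "abstract": "We compare sites in Kenya.",
    "keywords": "ecosystem services; Brazil",
    "affiliations": "University of Bonn, Germany",
    "funding": "",
}

# First column: where the study was conducted.
names1 = extract_iso3(" ".join([record["title"], record["abstract"], record["keywords"]]))
# Second column: where the institutions and funders are located.
names2 = extract_iso3(" ".join([record["affiliations"], record["funding"]]))
```

Plain name matching of this kind will miss alternative spellings and abbreviations unless they are added to the lookup table.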
Steps 2 and 3: Bundle countries in regions. The IPBES Regions and Subregions dataset was then used to add region and subregion attributes to the dataset by matching on the ISO3 code.
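The ISO3 match can be pictured as a simple lookup. The sketch below is an illustration only: the three rows and their labels are placeholders that should be replaced by the contents of the IPBES dataset, and the function and column names are hypothetical.

```python
# Illustrative rows only: ISO3 -> (region, subregion). Take the real labels from the IPBES dataset.
IPBES_REGIONS = {
    "KEN": ("Africa", "East Africa and adjacent islands"),
    "DEU": ("Europe and Central Asia", "Western Europe"),
    "BRA": ("Americas", "South America"),
}


def add_regions(record, iso_column):
    """Return a copy of record with region and subregion lists derived from an ISO3 column."""
    codes = record.get(iso_column, [])
    matched = [IPBES_REGIONS[code] for code in codes if code in IPBES_REGIONS]
    out = dict(record)
    out[iso_column + "_region"] = sorted({region for region, _ in matched})
    out[iso_column + "_subregion"] = sorted({subregion for _, subregion in matched})
    # Keep unmatched codes visible so gaps in the lookup can be checked by hand.
    out[iso_column + "_unmatched"] = [code for code in codes if code not in IPBES_REGIONS]
    return out


paper = add_regions({"names1": ["KEN", "BRA", "XXX"]}, "names1")
```

Recording the unmatched codes separately is a design choice of this sketch, so that a code missing from the lookup does not silently disappear from the regional totals.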
Step 4: Find TS accordingly. Finally, we used a set of files to add topic attributes to the dataset. Each file contained identifying information for the papers returned by a Web of Science search targeting a particular topic. This identifying information was matched to the corpus, and the topic was recorded for each matching paper.
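As a rough sketch of this matching step, each topic file can be reduced to a set of record identifiers and each paper looked up in every set. The text does not say which identifier was used, so the identifiers and topic labels below are placeholders.

```python
# Hypothetical: each topic file reduced to the set of record identifiers it contains.
TOPIC_IDS = {
    "topic_A": {"ID-0001", "ID-0002"},
    "topic_B": {"ID-0002", "ID-0003"},
}


def topics_for(record_id):
    """Return the sorted list of topics whose search results include this record."""
    return sorted(topic for topic, ids in TOPIC_IDS.items() if record_id in ids)


both = topics_for("ID-0002")   # a paper returned by both searches
none = topics_for("ID-9999")   # a paper returned by neither search
```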
The complete corpus, with the added country ISO codes for both research locations and funding institutions and the topic identification, was used as the basis for the rest of the research project. The complete corpus can be found on Zenodo here: https://doi.org/[INSERT DOI]
knitr::include_graphics("Outputs/Corpus/Names1_Names2.png")
Density of studies vs density of institutions
The IPBES Core Indicators were used alongside a set of other relevant indicators to examine how the geographic density of valuation studies relates to biological and socioeconomic indicators.
For each IPBES Core Indicator available in the country dataset, we used the most recent year, except for two indicators, Countries/Regions with Active NBSAP and Category 1 nations in CITES, which are binary in the dataset and therefore not compatible with the following analysis. For indicators with multiple categories, we selected a single category. For example, for the indicator “Area of forest production under FSC and PEFC certification” we chose the FSC certification area and not the PEFC certification area.
A set of other indicators was included in the analysis to expand the coverage of socioeconomic variables. We included the human development index (HDI), average harmonized learning outcomes score, gross domestic product (GDP), corruption perception index (CPI), and population.
These datasets were downloaded and cleaned, and ISO3 codes were added so that they could be merged easily into the analysis. The latest available data were used for each indicator.
The table below lists all of the indicators used, the category selected, and the year of the data. Indicators are numbered by their row order in the table (for example, Indicator 9 is the Region-based Marine Trophic Index).
| Name | Category | Year |
|---|---|---|
| Area of forest production under FSC and PEFC certification | FSC_area | 2016 |
| Biodiversity Habitat Index | Average | 2014 |
| Biodiversity Intactness Index | Value | 2005 |
| Biocapacity per capita | Value - Total | 2012 |
| Ecological Footprint per capita | Value - Total | 2012 |
| Forest area | Forest area (1000ha) | 2015 |
| Water Footprint | Water Footprint - Total (Mm3/y) | 2013 |
| Inland Fishery Production | Capture | 2015 |
| Region-based Marine Trophic Index | 1950 | 2014 |
| Nitrogen + Phosphate Fertilizers | N total nutrients - Consumption in nutrients | 2014 |
| Nitrogen Use Efficiency (%) | Nitrogen Use Efficiency (%) | 2009 |
| Percentage and total area covered by protected areas | Terrestrial - Protected Area (%) | 2017 |
| Percentage of undernourished people | Prevalence of undernourishment (%) (3-year average) | 2015 |
| Proportion of local breeds, classified as being at risk, not-at-risk or unknown level of risk of extinction | At Risk of Extinction | 2016 |
| PA of Key Biodiversity Areas Coverage (%) | Estimate | 2016 |
| Protected area management effectiveness | PA Assessed on Management Effectiveness (%) | 2015 |
| Protected Area Connectedness Index | Protected Area Connectedness Index | 2012 |
| Species Habitat Index | Species Habitat Index | 2014 |
| Species Protection Index (%) | Species Protection Index (%) | 2014 |
| Species Status Information Index | Value | 2014 |
| Total Wood Removals (roundwood, m3) | Total | 2014 |
| Trends in forest extent (tree cover) | Percentage of Tree Cover Loss | 2015 |
| Nitrogen Deposition Trends (kg N/ha/yr) | Nitrogen Deposition Trends (kg N/ha/yr) | 2030 |
| Trends in Pesticides Use | Use of pesticides (3-year average) | 2013 |
| Human Development Index (HDI) | NA | 2018 |
| Average harmonized learning outcomes score | NA | 2015 |
| Gross domestic product (GDP) | NA | 2019 |
| Corruption perception index (CPI) | NA | 2020 |
| Population | NA | 2018 |
There were a few instances of duplicated values. These were double-checked against the original dataset and the erroneous value was removed. For example, the USA had two values because Hawaii is listed separately in the original dataset. In these cases Hawaii was removed and the value for the rest of the country was used instead. Additionally, for Indicator 9, Region-based Marine Trophic Index, the mean of the regions was calculated per country, as countries such as Germany have multiple regions with distinct values.
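The two clean-up rules described above (dropping a separately listed area, and averaging regional values) can be sketched as follows. The rows, area labels, and values are made up, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows of (ISO3, area label, value); real values come from the indicator files.
rows = [
    ("USA", "United States", 10.0),
    ("USA", "Hawaii", 2.0),
    ("DEU", "Baltic Sea", 3.0),
    ("DEU", "North Sea", 4.0),
]


def one_value_per_country(rows, drop_areas=(), average=False):
    """Collapse rows to one value per ISO3 code.

    Rows whose area label is in drop_areas are discarded first. Remaining
    duplicates are averaged when average=True and raise an error otherwise,
    so that unexpected duplicates are checked by hand.
    """
    grouped = defaultdict(list)
    for iso3, area, value in rows:
        if area not in drop_areas:
            grouped[iso3].append(value)
    result = {}
    for iso3, values in grouped.items():
        if len(values) > 1 and not average:
            raise ValueError(f"unresolved duplicate values for {iso3}")
        result[iso3] = mean(values)
    return result


usa = one_value_per_country(rows[:2], drop_areas={"Hawaii"})  # keep the mainland value
deu = one_value_per_country(rows[2:], average=True)           # mean of the regions
```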
To understand how valuation studies are distributed geographically, we counted the number of times each country’s ISO code appeared in the corpus for both country columns added in Step 1. The result is the density of studies per country and the density of funding institutions per country for the entire corpus.
The external indicators were also joined onto the dataset to analyze the relationships between these socioeconomic indicators and the density of studies and funding institutions.
This process was also repeated with an additional filter that excluded any studies published before 2010.
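The counting and the 2010 filter can be sketched as below. The three records are invented, the field names are hypothetical, and counting each country at most once per paper is an assumption of this sketch rather than something stated above.

```python
from collections import Counter

# Made-up corpus records with hypothetical field names.
corpus = [
    {"year": 2005, "names1": ["KEN"], "names2": ["DEU"]},
    {"year": 2012, "names1": ["KEN", "BRA"], "names2": ["DEU", "BRA"]},
    {"year": 2018, "names1": ["BRA"], "names2": ["BRA"]},
]


def density(records, column, min_year=None):
    """Count how many records mention each ISO3 code in the given column.

    When min_year is set, records published before that year are excluded.
    """
    counts = Counter()
    for rec in records:
        if min_year is not None and rec["year"] < min_year:
            continue
        counts.update(set(rec[column]))  # assumption: one count per paper per country
    return counts


studies_all = density(corpus, "names1")
funders_all = density(corpus, "names2")
studies_2010 = density(corpus, "names1", min_year=2010)
```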
To investigate the relationships between the indicators and the number of valuation studies, we ran a Pearson correlation analysis. The analysis calculated the correlation between each indicator and both the number of studies conducted in each country (Names 1) and the number of studies funded in each country (Names 2). The results are shown below.
Non-significant relationships are left blank; significant relationships (P < 0.01) are shown as circles. The size of each circle corresponds to the strength of the correlation, and the color indicates a positive (blue) or negative (red) correlation.
knitr::include_graphics("Outputs/Pearson_correlation_table/correlation_figure.png")
Pearson correlations of geographic valuation studies and indicators
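For reference, the quantity behind each cell of such a figure can be written out in a few lines. This is a standard-library sketch with made-up numbers, not the analysis code; dropping pairs with a missing value is an assumption. The test statistic follows a Student's t distribution with n - 2 degrees of freedom, from which the two-sided p-value compared against 0.01 would be obtained (for example, `scipy.stats.pearsonr` returns both the coefficient and the p-value).

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences, ignoring pairs with a missing value."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least three complete pairs")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy), n


def t_statistic(r, n):
    """Test statistic for H0: no correlation, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Made-up example: study counts and an indicator for five countries.
counts = [1, 2, 3, 4, 5]
indicator = [2, 4, 5, 4, 5]
r, n = pearson_r(counts, indicator)
t = t_statistic(r, n)
```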
The same Pearson correlation analysis was conducted between the indicators and the log of the number of studies in each country (Names 1) and the log of the number of studies funded in each country (Names 2).
knitr::include_graphics("Outputs/Pearson_correlation_table/correlation_figure_log.png")
Pearson correlations of log geographic valuation studies and the indicators
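The log transform needs a rule for countries with a count of zero, since the logarithm of zero is undefined. The text does not state the log base or how zeros were handled, so the sketch below assumes the natural log and skips non-positive counts; the counts and indicator values are made up.

```python
import math


def log_pairs(counts, indicator):
    """Pair log(count) with the indicator value for countries that have both.

    Assumptions of this sketch: natural log, and countries with a zero count
    or a missing indicator value are skipped.
    """
    pairs = []
    for iso3, n in sorted(counts.items()):
        value = indicator.get(iso3)
        if n > 0 and value is not None:
            pairs.append((iso3, math.log(n), value))
    return pairs


# Made-up counts and indicator values.
study_counts = {"KEN": 1, "BRA": 100, "DEU": 0, "USA": 10}
hdi = {"KEN": 0.60, "BRA": 0.76, "DEU": 0.94}
pairs = log_pairs(study_counts, hdi)
```

The resulting log counts and indicator values can then be passed to the same correlation routine used for the untransformed counts.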
The individual trends between each indicator and the number of studies and number of funding institutions are shown here for the entire corpus.
knitr::include_graphics("Outputs/Corpus/Names1_HDI.png")
HDI vs. Density of Studies and Institutions
knitr::include_graphics("Outputs/Corpus/Names2_HDI.png")
HDI vs. Density of Studies and Institutions
knitr::include_graphics("Outputs/Corpus/Names1_GDP.png")
GDP vs. Density of Studies and Institutions
knitr::include_graphics("Outputs/Corpus/Names2_GDP.png")
GDP vs. Density of Studies and Institutions
knitr::include_graphics("Outputs/Corpus/Names1_CPI.png")
Corruption Perception Index vs. Density of Studies and Institutions
knitr::include_graphics("Outputs/Corpus/Names2_CPI.png")
Corruption Perception Index vs. Density of Studies and Institutions
The trends between each indicator and the density of studies and funding institutions are shown here for the entire corpus.
knitr::include_graphics("Outputs/Corpus/Names1_LearningOutcomes.png")
Learning outcomes vs. Density of Studies and Institutions
knitr::include_graphics("Outputs/Corpus/Names2_LearningOutcomes.png")
Learning outcomes vs. Density of Studies and Institutions